Halo clustering and $g_{NL}$-type primordial non-Gaussianity



Kendrick M. Smith$^1$, Simone Ferraro$^1$, and Marilena LoVerde$^2$

$^1$ Princeton University Observatory, Peyton Hall, Ivy Lane, Princeton, NJ 08544, USA
$^2$ Institute for Advanced Study, Einstein Drive, Princeton, NJ 08540, USA

Abstract

A wide range of multifield inflationary models generate non-Gaussian initial conditions in which
the initial adiabatic fluctuation is of the form $(\zeta_G + g_{NL}\zeta_G^3)$. We study halo clustering in these
models using two different analytic methods: the peak-background split framework, and brute force
calculation in a barrier crossing model, obtaining agreement between the two. We find a simple,
theoretically motivated expression for halo bias which agrees with $N$-body simulations and can be
used to constrain $g_{NL}$ from observations. We discuss practical caveats to constraining $g_{NL}$ using only
observable properties of a tracer population, and argue that constraints obtained from populations
whose observed bias is $\lesssim 2.5$ are generally not robust to uncertainties in modeling the halo occupation
distribution of the population.

1 Introduction



In the last few decades, increasingly precise observations (e.g. [1-6]) have led to a standard
cosmological model in which small initial fluctuations evolve in a $\Lambda$CDM background to give rise to the
observed universe. Current data are consistent with initial fluctuations which are adiabatic, scalar,
and Gaussian, with weak deviations from scale invariance ($n_s < 1$ at $3\sigma$).

The statistics of the initial fluctuations, i.e. deviations from Gaussian initial conditions, provide
a powerful probe of the physics of the early universe. In the context of inflation [7-13], the
simplest models (single-field, minimally coupled slow-roll) predict initial curvature perturbations with
negligible deviations from Gaussianity. However, there is a rich phenomenology of non-Gaussian
initial conditions in models with multiple fields, self-interactions near horizon crossing, or speed of
sound $c_s \ll 1$ during inflation. In this paper, we will focus on non-Gaussianity of the so-called local
type [14-17], in which the primordial potential$^1$ is of the form

$$\Phi(\mathbf{x}) = \Phi_G(\mathbf{x}) + f_{NL}\left(\Phi_G(\mathbf{x})^2 - \langle\Phi_G^2\rangle\right) + g_{NL}\left(\Phi_G(\mathbf{x})^3 - 3\langle\Phi_G^2\rangle\Phi_G(\mathbf{x})\right) \tag{1}$$

where $\Phi_G$ is a Gaussian field and $f_{NL}$, $g_{NL}$ are free parameters.$^2$

Local non-Gaussianity can be generated by physical mechanisms involving multiple fields, such 
as light spectator fields during inflation which evolve to generate the initial adiabatic fluctuations 
(the curvaton scenario) [18-20], or models where the inflaton decay rate is modulated by a second 
field [21,22]. Non-Gaussianity of local type is also naturally generated in non-inflationary models 
of the early universe such as the new ekpyrotic/cyclic scenario [23-25]. There is a theorem [26,27] 
which states that any single-field model of inflation cannot generate detectable levels of local non- 
Gaussianity without violating observed limits on deviation from a scale-invariant power spectrum. 
Thus, detection of either $f_{NL}$ or $g_{NL}$ would rule out all single-field models of inflation and place
powerful constraints on the physics of the early universe. Current observational constraints on these 
parameters are consistent with zero [1,28-30], but are expected to improve by an order of magnitude 
or more in the near future. 

In models of inflation in which $|g_{NL}| = O(f_{NL}^2)$, it is unlikely that observational constraints
on $g_{NL}$ will be competitive with constraints on $f_{NL}$. However, there are a number of examples
where $f_{NL}^2 \ll |g_{NL}|$, making the $g_{NL}$ term in Eq. (1) the dominant source of primordial
non-Gaussianity. This situation arises in curvaton models where non-quadratic terms in the potential are
important [31-35] or in multifield models in which $\Delta N$ varies rapidly at the end of inflation [36,37].
The existence of these scenarios makes searching for $g_{NL}$ just as important as searching for $f_{NL}$, and
measurements of either provide important constraints on the microphysical parameter space.

$^1$ In studies of primordial non-Gaussianity, it is conventional to define a primordial potential $\Phi = \frac{3}{5}\zeta$, where $\zeta$ is the
initial adiabatic curvature perturbation. Note that $\Phi$ is not the conformal Newtonian potential, which is given by $\frac{10}{9}\Phi$
deep in the radiation-dominated epoch where Eq. (1) applies.

$^2$ We define $g_{NL}$-type non-Gaussianity including the term $-3\langle\Phi_G^2\rangle\Phi_G$; this term simply renormalizes $\Phi_G$ so that its
power spectrum $P_{\Phi_G}$ is equal to the observed power spectrum $P_\Phi$ (to first order in $g_{NL}$).

In a pioneering paper [38], Dalal et al. showed that large-scale clustering of halos depends
sensitively on $f_{NL}$. More precisely, a sample of halos (or tracers such as galaxies or quasars) with
constant bias $b_1$ in a Gaussian cosmology will have scale-dependent bias given by

$$b(k) \approx b_1 + 2\delta_c(b_1 - 1)\frac{f_{NL}}{\alpha(k,z)} \tag{2}$$




in an $f_{NL}$ cosmology. Here, $\delta_c$ is the spherical collapse threshold and $\alpha(k,z)$ is defined by

$$\alpha(k,z) = \frac{2k^2 T(k) D(z)}{3\Omega_m H_0^2} \tag{3}$$

so that the linear density field and the primordial potential are related by $\delta_{lin}(\mathbf{k},z) = \alpha(k,z)\Phi(\mathbf{k})$.
Large-scale structure constraints on $f_{NL}$ from scale-dependent bias are currently competitive with
the CMB (e.g. [28,39]) and may ultimately provide constraints which are stronger (e.g. [40,41]). The
key identity (2) has been derived using several different analytic frameworks [28,42,43] and agrees
with $N$-body simulations (e.g. [30,38,44,45]).
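As a rough numerical illustration of Eq. (2), the following Python sketch evaluates the scale-dependent bias on large scales. It assumes the standard form $\alpha(k,z) = 2k^2 T(k) D(z)/(3\Omega_m H_0^2)$ and, purely for simplicity, sets $T(k) = 1$ and $D(z) = 1/(1+z)$ (the large-scale, matter-dominated limits); a real analysis would take both from a Boltzmann code. The function names, the value $\Omega_m \approx 0.279$ implied by the fiducial densities, and the default $\delta_c = 1.42$ are choices made for this example.

```python
# Illustrative sketch only: T(k) = 1 and D(z) = 1/(1+z) are simplifying assumptions.
C_KM_S = 299792.458            # speed of light in km/s
H0_OVER_C = 100.0 / C_KM_S     # H_0/c in h/Mpc, since H_0 = 100 h km/s/Mpc


def alpha(k, z=0.0, omega_m=0.279):
    """alpha(k,z) = 2 k^2 T D / (3 Omega_m H_0^2), with k in h/Mpc and T = 1."""
    growth = 1.0 / (1.0 + z)   # assumed matter-era growth normalisation
    return 2.0 * k**2 * growth / (3.0 * omega_m * H0_OVER_C**2)


def bias_fnl(k, b1, fnl, z=0.0, delta_c=1.42):
    """Eq. (2): b(k) ~ b1 + 2 delta_c (b1 - 1) fnl / alpha(k,z)."""
    return b1 + 2.0 * delta_c * (b1 - 1.0) * fnl / alpha(k, z)


if __name__ == "__main__":
    for k in (0.005, 0.01, 0.02, 0.04):
        print(f"k = {k:5.3f} h/Mpc   b(k) = {bias_fnl(k, b1=3.0, fnl=50.0):.4f}")
```

Because $\alpha \propto k^2$ in this limit, the correction grows as $1/k^2$ toward large scales and vanishes for an unbiased tracer ($b_1 = 1$).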

In this paper we study the related issue of large-scale halo clustering in a $g_{NL}$ cosmology. We
consider the large-scale halo bias in two analytic frameworks: the peak-background split (§3) and
a barrier crossing model (§4). We find consistency between the two formalisms (in disagreement
with [46]) and obtain an expression analogous to Eq. (2) for the scale-dependent halo bias in a $g_{NL}$
cosmology. Our main results are a universal relation between the scale-dependent halo bias in a $g_{NL}$
cosmology and the mass function in an $f_{NL}$ cosmology,

$$b(k) \approx b_1 + \frac{\beta_g g_{NL}}{\alpha(k,z)} \quad \text{where} \quad \beta_g = 3\frac{\partial\log n}{\partial f_{NL}} \tag{4}$$

and expressions for $\beta_g$ (Eqs. (50), (51)) which can be used in practice to constrain $g_{NL}$ from data. We
also discuss caveats when estimating the $g_{NL}$ bias from observable quantities (§5.4) and argue that
constraints obtained from tracer populations which are not highly biased ($b_1 \lesssim 2.5$) are generally
not robust to uncertainties in HOD modeling.

Throughout this paper we use the WMAP5+BAO+SN fiducial cosmology [47], with baryon
density $\Omega_b h^2 = 0.0226$, CDM density $\Omega_c h^2 = 0.114$, Hubble parameter $h = 0.70$, spectral index
$n_s = 0.961$, optical depth $\tau = 0.080$, and power-law initial curvature power spectrum $k^3 P_\zeta(k)/2\pi^2 =
\Delta_\zeta^2 (k/k_{piv})^{n_s - 1}$ where $\Delta_\zeta^2 = 2.42 \times 10^{-9}$ and $k_{piv} = 0.002$ Mpc$^{-1}$. All power spectra and transfer
functions have been computed using CAMB [48].

2 Definitions and notation

We will sometimes model halos of mass $> M$ with peaks in a smoothed density field $\delta_M$ defined
as follows. Let $\delta_M(\mathbf{x})$ be the linear density field smoothed by a tophat filter with radius $R(M) =
(3M/4\pi\rho_m)^{1/3}$, i.e.

$$\delta_M(\mathbf{x}) = \int \frac{d^3k}{(2\pi)^3}\, e^{i\mathbf{k}\cdot\mathbf{x}}\, W_M(k)\, \delta_{lin}(\mathbf{k}) \tag{5}$$

where

$$W_M(k) = \frac{3\left[\sin(kR) - kR\cos(kR)\right]}{(kR)^3}, \qquad R = R(M). \tag{6}$$
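The tophat smoothing just described is easy to evaluate numerically. The sketch below, offered only as an illustration, computes $R(M) = (3M/4\pi\rho_m)^{1/3}$ and the standard Fourier-space tophat window $W(kR) = 3[\sin(kR) - kR\cos(kR)]/(kR)^3$. The critical density constant, the value $\Omega_m \approx 0.279$, and the function names are assumptions of this example.

```python
import math

RHO_CRIT = 2.775e11   # critical density in h^2 Msun / Mpc^3 (standard value)


def tophat_radius(mass, omega_m=0.279):
    """R(M) = (3M / 4 pi rho_m)^(1/3) in Mpc/h, for M in Msun/h."""
    rho_m = omega_m * RHO_CRIT
    return (3.0 * mass / (4.0 * math.pi * rho_m)) ** (1.0 / 3.0)


def tophat_window(k, radius):
    """Fourier transform of a real-space tophat, normalised so W -> 1 as k -> 0."""
    x = k * radius
    if abs(x) < 1e-3:
        return 1.0 - x * x / 10.0   # series expansion avoids cancellation
    return 3.0 * (math.sin(x) - x * math.cos(x)) / x**3


if __name__ == "__main__":
    r = tophat_radius(1.0e14)
    print(f"R(1e14 Msun/h) = {r:.2f} Mpc/h, W(k=0.05 h/Mpc) = {tophat_window(0.05, r):.4f}")
```

For cluster-mass halos the smoothing radius is a few $h^{-1}$ Mpc, so $W_M(k) \approx 1$ on the very large scales where the scale-dependent bias is measured.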



Let $\sigma_M = \langle\delta_M^2\rangle^{1/2}$ be the RMS amplitude of the smoothed density field, and let $\kappa_n(M)$ be its $n$-th
non-Gaussian cumulant, defined by:

$$\kappa_n(M) = \frac{\langle\delta_M^n\rangle_{conn}}{\sigma_M^n} \tag{7}$$

Since $\delta_M$ and $\sigma_M$ are defined via linear theory, $\kappa_n(M)$ is independent of redshift as implied by the
notation. To first order in $f_{NL}$ and $g_{NL}$, we have

$$\kappa_3(M) = \kappa_3^{(1)}(M)\, f_{NL} \tag{8}$$
$$\kappa_4(M) = \kappa_4^{(1)}(M)\, g_{NL} \tag{9}$$

with higher cumulants equal to zero, where $\kappa_3^{(1)}(M)$, $\kappa_4^{(1)}(M)$ are the values of the cumulants at
$f_{NL} = 1$ and $g_{NL} = 1$ respectively. These values are given explicitly by:

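$$\kappa_3^{(1)}(M) = \frac{6}{\sigma_M^3}\int\frac{d^3k}{(2\pi)^3}\frac{d^3k'}{(2\pi)^3}\, W_M(k)\, W_M(k')\, W_M(|\mathbf{k}+\mathbf{k}'|)\, \frac{P_{mm}(k)\, P_{mm}(k')\, \alpha(|\mathbf{k}+\mathbf{k}'|)}{\alpha(k)\,\alpha(k')} \tag{10}$$

$$\kappa_4^{(1)}(M) = \frac{24}{\sigma_M^4}\int\frac{d^3k}{(2\pi)^3}\frac{d^3k'}{(2\pi)^3}\frac{d^3k''}{(2\pi)^3}\, W_M(k)\, W_M(k')\, W_M(k'')\, W_M(|\mathbf{k}+\mathbf{k}'+\mathbf{k}''|)\, \frac{P_{mm}(k)\, P_{mm}(k')\, P_{mm}(k'')\, \alpha(|\mathbf{k}+\mathbf{k}'+\mathbf{k}''|)}{\alpha(k)\,\alpha(k')\,\alpha(k'')} \tag{11}$$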

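where $\alpha(k)$ was defined previously in Eq. (3) and $P_{mm}(k)$ is the power spectrum of the linear
density field, $\langle\delta_{lin}(\mathbf{k})\delta_{lin}(\mathbf{k}')\rangle = (2\pi)^3 P_{mm}(k)\,\delta^{(3)}(\mathbf{k}+\mathbf{k}')$. For numerical calculation, the following
fitting functions (from [49]) are convenient:

$$\kappa_3^{(1)}(M) = (6.6 \times 10^{-4})\left(1 - 0.016\log\frac{M}{h^{-1}M_\odot}\right) \tag{12}$$
$$\kappa_4^{(1)}(M) = (1.6 \times 10^{-7})\left(1 - 0.021\log\frac{M}{h^{-1}M_\odot}\right) \tag{13}$$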

This paper is mainly concerned with calculating halo bias $b(k) = P_{mh}(k)/P_{mm}(k)$ to first order in
$f_{NL}$ and $g_{NL}$, so let us establish notation from the outset, by writing the large-scale bias in the
general form:

$$b(k) = b_1 + b_{1f} f_{NL} + b_{1g} g_{NL} + \frac{\beta_f f_{NL} + \beta_g g_{NL}}{\alpha(k)} \tag{14}$$

where unlike Eq. (2) and Eq. (4) we have allowed for scale-independent corrections $b_{1f}$ and $b_{1g}$ from
$f_{NL}$ and $g_{NL}$ primordial non-Gaussianity. Equation (14) defines the coefficients $b_1$, $b_{1f}$, $b_{1g}$, $\beta_f$, $\beta_g$.
This equation assumes that the $k$-dependence is of the functional form (constant) + (constant)/$\alpha(k)$,
but we will derive this analytically (Eq. (35)) and show that it agrees with simulations (§5.1). In
this notation, the Dalal et al. formula (2) can be written as $\beta_f = 2\delta_c(b_1 - 1)$.



3 Peak-background split 

The peak-background split formalism is a procedure for predicting halo clustering statistics on
large scales. The basic idea is that a long-wavelength fluctuation in the initial curvature alters the
local abundance of halos in a way which is equivalent to perturbing parameters of the background
cosmology, e.g. the matter density $\rho_m$ or the amplitude $\Delta_\Phi$ of the initial fluctuations. The use of
this formalism to study halo bias in non-Gaussian cosmologies was pioneered in [28]; we will review
this calculation of the bias in an $f_{NL}$ cosmology (§3.1) and then perform an analogous calculation
in the $g_{NL}$ case (§3.2).



3.1 $f_{NL}$ cosmology

In an $f_{NL}$ cosmology, the initial conditions are given by:

$$\Phi(\mathbf{x}) = \Phi_G(\mathbf{x}) + f_{NL}\left(\Phi_G(\mathbf{x})^2 - \langle\Phi_G^2\rangle\right) \tag{15}$$

To analyze the effect of a long-wavelength mode, let us decompose the Gaussian potential as a sum
$\Phi_G = \Phi_l + \Phi_s$ of long-wavelength and short-wavelength contributions. The long/short-wavelength
decomposition of the non-Gaussian potential $\Phi$ is then

$$\Phi(\mathbf{x}) = \underbrace{\Phi_l(\mathbf{x}) + f_{NL}\left(\Phi_l(\mathbf{x})^2 - \langle\Phi_l^2\rangle\right)}_{\text{long}} + \underbrace{\left(1 + 2f_{NL}\Phi_l(\mathbf{x})\right)\Phi_s(\mathbf{x}) + f_{NL}\left(\Phi_s(\mathbf{x})^2 - \langle\Phi_s^2\rangle\right)}_{\text{short}} \tag{16}$$

and contains explicit coupling between long and short wavelength modes of the Gaussian potential.

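Let us consider how the term $(1 + 2f_{NL}\Phi_l(\mathbf{x}))\Phi_s(\mathbf{x})$ in Eq. (16) affects $n_l(\mathbf{x})$, the long-wavelength
part of the halo number density field. In a local region where the long-wavelength potential takes
some value $\Phi_l$, the amplitude $\Delta_\Phi$ of the small-scale modes is perturbed: $\Delta_\Phi \to (1 + 2f_{NL}\Phi_l)\Delta_\Phi$. This
modifies the local halo abundance, in the same way that the global abundance would be modified
if the cosmological parameter $\Delta_\Phi$ were perturbed, i.e. we get a term in the long-wavelength halo
density of the form $\Delta n(\mathbf{x}) = 2f_{NL}\Phi_l(\mathbf{x})(\partial n/\partial\log\Delta_\Phi)$. In addition, even in a Gaussian cosmology,
there is a perturbation to the local halo abundance which is proportional to the long-wavelength
part $\delta_l(\mathbf{x})$ of the density fluctuation, i.e. a term of the form $\Delta n(\mathbf{x}) = \delta_l(\mathbf{x})(\partial n/\partial\delta_l)$. Putting this
together, the long-wavelength part of the halo density is given by:$^3$

$$n_l(\mathbf{x}) = \bar{n} + \frac{\partial n}{\partial\delta_l}\delta_l(\mathbf{x}) + 2f_{NL}\frac{\partial n}{\partial\log\Delta_\Phi}\Phi_l(\mathbf{x}) = \bar{n}\left(1 + b_1\delta_l(\mathbf{x}) + \beta_f f_{NL}\Phi_l(\mathbf{x})\right) \tag{17}$$

where

$$b_1 = \frac{\partial\log n}{\partial\delta_l} \tag{18}$$
$$\beta_f = 2\frac{\partial\log n}{\partial\log\Delta_\Phi} \tag{19}$$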

$^3$ In this derivation, we have swept two terms in Eq. (16) under the rug; let us now argue that these are indeed
negligible. The term $f_{NL}(\Phi_s(\mathbf{x})^2 - \langle\Phi_s^2\rangle)$ alters the statistics of the small-scale modes; this does perturb the halo
abundance (by generating skewness in the density field) but the perturbation is independent of the long-wavelength
fluctuation $\Phi_l$. Therefore, this term does not contribute to the large-scale halo bias. The term $f_{NL}(\Phi_l(\mathbf{x})^2 - \langle\Phi_l^2\rangle)$
perturbs the long-wavelength modes and decorrelates them (to order $O(f_{NL})$) from both the linear density fluctuation
$\delta(\mathbf{x})$ and the field $\Phi_l(\mathbf{x})$ which modulates the local power spectrum amplitude $\Delta_\Phi$. In principle, this should
generate stochastic bias at order $O(f_{NL}^2)$, but we will neglect this, since we are only calculating to order $O(f_{NL})$.

Intuitively, in an $f_{NL}$ cosmology, the local power spectrum amplitude $\Delta_\Phi$ is not spatially constant,
but varies throughout the universe in a way which is proportional to the long-wavelength potential $\Phi_l$.
Computing the halo bias $b(k) = P_{mh}(k)/P_{mm}(k)$ from Eq. (17) for $n_l(\mathbf{x})$, we get:

$$b(k) = \frac{b_1 P_{mm}(k) + \beta_f f_{NL} P_{m\Phi}(k)}{P_{mm}(k)} = b_1 + \beta_f\frac{f_{NL}}{\alpha(k,z)} \tag{20}$$

From the preceding argument, we predict that the scale-dependent $f_{NL}$ bias is given by $\beta_f =
2(\partial\log n/\partial\log\Delta_\Phi)$. We will refer to this as a "weak" prediction for the bias: it cannot be used to
constrain $f_{NL}$ from real data, since $\beta_f$ has not been expressed in terms of observable quantities.

To make further progress, we need to evaluate the derivative $(\partial\log n/\partial\log\Delta_\Phi)$, by making
additional assumptions. If we assume that the halo mass function is universal, then one can calculate
the derivative, obtaining $(\partial\log n/\partial\log\Delta_\Phi) = \delta_c(b_1 - 1)$, where $b_1$ is the Gaussian bias [28], so that:

$$\beta_f = 2\delta_c(b_1 - 1). \tag{21}$$

We will refer to this as a "strong" prediction for the scale-dependent bias in an $f_{NL}$ cosmology,
since $\beta_f$ has been expressed in terms of the observable quantity $b_1$. The strong form is essential for
constraining $f_{NL}$ from observations.
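The step from the "weak" to the "strong" form can be checked numerically in the simplest universal case. The sketch below assumes a Press-Schechter mass function, $n \propto \nu e^{-\nu^2/2}$ at fixed mass with $\nu = \delta_c/\sigma_M$, and uses the fact that $\sigma_M$ scales linearly with the fluctuation amplitude. It then compares a finite-difference estimate of $\partial\log n/\partial\log\Delta_\Phi$ with $\delta_c(b_1 - 1)$. The function names and the fiducial value of $\sigma_M$ are illustrative assumptions.

```python
import math


def log_ps_abundance(amplitude, sigma_fid=0.8, delta_c=1.42):
    """Log of the amplitude-dependent part of a Press-Schechter mass function.

    sigma_M scales linearly with the fluctuation amplitude, so
    nu = delta_c / (amplitude * sigma_fid). The factor d(log sigma^-1)/dM is
    unchanged by a rescaling of the amplitude and is dropped.
    """
    nu = delta_c / (amplitude * sigma_fid)
    return math.log(nu) - 0.5 * nu**2


def dlogn_dlogamp(sigma_fid=0.8, delta_c=1.42, eps=1e-5):
    """Central finite difference of log n in log(amplitude), at amplitude = 1."""
    up = log_ps_abundance(math.exp(eps), sigma_fid, delta_c)
    down = log_ps_abundance(math.exp(-eps), sigma_fid, delta_c)
    return (up - down) / (2.0 * eps)


def ps_bias(sigma_fid=0.8, delta_c=1.42):
    """Gaussian Press-Schechter bias b1 = 1 + (nu^2 - 1)/delta_c."""
    nu = delta_c / sigma_fid
    return 1.0 + (nu**2 - 1.0) / delta_c


if __name__ == "__main__":
    for sigma in (0.5, 0.8, 1.2):
        weak = dlogn_dlogamp(sigma)
        strong = 1.42 * (ps_bias(sigma) - 1.0)
        print(f"sigma_M = {sigma}: dlogn/dlogDelta = {weak:.6f}, delta_c (b1 - 1) = {strong:.6f}")
```

Both columns equal $\nu^2 - 1$, which is the content of Eq. (21) for this particular mass function; for other universal mass functions the same identity holds with the corresponding $b_1$.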

3.2 $g_{NL}$ cosmology

Let us now generalize the analysis of large-scale clustering in the previous subsection to the case of
a $g_{NL}$ cosmology, with initial conditions given by:

$$\Phi(\mathbf{x}) = \Phi_G(\mathbf{x}) + g_{NL}\left(\Phi_G(\mathbf{x})^3 - 3\langle\Phi_G^2\rangle\Phi_G(\mathbf{x})\right). \tag{22}$$

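Separating the Gaussian field into long and short wavelength pieces $\Phi_G = \Phi_l + \Phi_s$, we decompose
$\Phi$ as follows:

$$\Phi(\mathbf{x}) = \underbrace{\Phi_l(\mathbf{x}) + g_{NL}\left(\Phi_l(\mathbf{x})^3 - 3\langle\Phi_l^2\rangle\Phi_l(\mathbf{x})\right)}_{\text{long}}$$
$$\qquad + \underbrace{\Phi_s(\mathbf{x}) + 3g_{NL}\left(\Phi_l(\mathbf{x})^2 - \langle\Phi_l^2\rangle\right)\Phi_s(\mathbf{x}) + 3g_{NL}\Phi_l(\mathbf{x})\left(\Phi_s(\mathbf{x})^2 - \langle\Phi_s^2\rangle\right) + g_{NL}\left(\Phi_s(\mathbf{x})^3 - 3\langle\Phi_s^2\rangle\Phi_s(\mathbf{x})\right)}_{\text{short}} \tag{23}$$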
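The long/short decomposition of the cubic term is a purely algebraic identity and can be verified with a few lines of Python. The check below assumes that the long and short modes are uncorrelated, so that $\langle\Phi_G^2\rangle = \langle\Phi_l^2\rangle + \langle\Phi_s^2\rangle$, and treats the field values and variances as arbitrary numbers. The function names are hypothetical.

```python
import random


def gnl_term_full(phi_l, phi_s, var_l, var_s):
    """Cubic term Phi_G^3 - 3 <Phi_G^2> Phi_G with Phi_G = Phi_l + Phi_s."""
    phi = phi_l + phi_s
    return phi**3 - 3.0 * (var_l + var_s) * phi


def gnl_term_split(phi_l, phi_s, var_l, var_s):
    """The same term written as the four pieces of the long/short decomposition."""
    long_part = phi_l**3 - 3.0 * var_l * phi_l
    modulation = 3.0 * (phi_l**2 - var_l) * phi_s    # rescales the small-scale amplitude
    local_fnl = 3.0 * phi_l * (phi_s**2 - var_s)     # acts like a local f_NL
    short_part = phi_s**3 - 3.0 * var_s * phi_s      # small-scale kurtosis
    return long_part + modulation + local_fnl + short_part


if __name__ == "__main__":
    random.seed(1)
    args = (random.gauss(0, 1), random.gauss(0, 1), random.random(), random.random())
    print(gnl_term_full(*args) - gnl_term_split(*args))   # zero up to rounding
```

The three short-wavelength pieces are labelled in the comments according to the physical interpretation each receives in the peak-background split argument.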

As in the $f_{NL}$ case, we'll consider the perturbation to the long-wavelength halo density $n_l(\mathbf{x})$
generated by each of these terms.

The term $3g_{NL}(\Phi_l(\mathbf{x})^2 - \langle\Phi_l^2\rangle)\Phi_s(\mathbf{x})$ can be interpreted as a local modulation in the small-scale
power spectrum amplitude, given by $\Delta_\Phi \to (1 + 3g_{NL}(\Phi_l(\mathbf{x})^2 - \langle\Phi_l^2\rangle))\Delta_\Phi$. This generates a term
$\Delta n_l(\mathbf{x}) = 3g_{NL}(\Phi_l(\mathbf{x})^2 - \langle\Phi_l^2\rangle)(\partial n/\partial\log\Delta_\Phi)$ in the long-wavelength halo density, in close analogy
with the $f_{NL}$ case (the modulation is proportional to $g_{NL}(\Phi_l^2 - \langle\Phi_l^2\rangle)$ in this case, rather than $f_{NL}\Phi_l$).

The term $3g_{NL}\Phi_l(\mathbf{x})(\Phi_s(\mathbf{x})^2 - \langle\Phi_s^2\rangle)$ can be interpreted as follows. In a local region where the
long-wavelength potential takes the value $\Phi_l$, the small-scale modes are perturbed in the same way
as in an $f_{NL}$ cosmology where the global value of $f_{NL}$ is given by $(3g_{NL}\Phi_l)$. This generates a term
$\Delta n_l(\mathbf{x}) = 3g_{NL}\Phi_l(\mathbf{x})(\partial n/\partial f_{NL})$ in the long-wavelength halo density.

Finally, there is the usual term $\Delta n_l(\mathbf{x}) = \delta_l(\mathbf{x})(\partial n/\partial\delta_l)$ due to changes in mean background
density (as in the Gaussian case).

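Putting this all together, we find that the long-wavelength halo density field in a $g_{NL}$ cosmology
is given by:$^4$

$$n_l(\mathbf{x}) = \bar{n} + \frac{\partial n}{\partial\delta_l}\delta_l(\mathbf{x}) + 3g_{NL}\frac{\partial n}{\partial\log\Delta_\Phi}\left(\Phi_l(\mathbf{x})^2 - \langle\Phi_l^2\rangle\right) + 3g_{NL}\frac{\partial n}{\partial f_{NL}}\Phi_l(\mathbf{x})$$
$$\qquad = \bar{n}\left(1 + b_1\delta_l(\mathbf{x}) + \frac{3}{2}\beta_f g_{NL}\left(\Phi_l(\mathbf{x})^2 - \langle\Phi_l^2\rangle\right) + \beta_g g_{NL}\Phi_l(\mathbf{x})\right) \tag{24}$$

where $b_1$ and $\beta_f$ were defined previously (Eqs. (18), (19)), and:

$$\beta_g = 3\frac{\partial\log n}{\partial f_{NL}} \tag{25}$$

The large-scale halo bias $b(k) = P_{mh}(k)/P_{mm}(k)$ is given by:

$$b(k) = b_1 + \beta_g\frac{g_{NL}}{\alpha(k,z)}. \tag{26}$$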

Note that the $(\beta_f g_{NL})$ term in Eq. (24) does not contribute to the bias, since the field $(\Phi_l(\mathbf{x})^2 - \langle\Phi_l^2\rangle)$
and the long-wavelength density field $\delta_l$ are uncorrelated (their cross correlation is a three-point
function of Gaussian fields, which vanishes). This term should generate stochastic bias, but we defer
a systematic study of halo stochasticity in non-Gaussian cosmologies to a future paper [50].

We have now arrived at the peak-background split prediction (26) for halo bias in a $g_{NL}$
cosmology, which relates the scale-dependent $g_{NL}$ bias to the derivative $(\partial\log n/\partial f_{NL})$ of the halo
mass function in an $f_{NL}$ cosmology. In the terminology of the previous subsection, this is a "weak"
prediction: we have shown that the problem of computing the $g_{NL}$ bias is naturally related to the
problem of understanding the mass function in an $f_{NL}$ cosmology, but the coefficient $\beta_g$ has not
been expressed in terms of observable quantities.

To obtain a "strong" prediction, we need to evaluate the derivative $(\partial\log n/\partial f_{NL})$, which requires
making additional assumptions. This has been done in [49], assuming a barrier crossing model for
the mass function and using the Edgeworth expansion to calculate the derivative (see also [51-57]).
The result is:

$$\frac{\partial\log n}{\partial f_{NL}} = \frac{\kappa_3^{(1)}(M)}{6}H_3(\nu) - \frac{1}{6}\frac{d\kappa_3^{(1)}/dM}{d\nu/dM}H_2(\nu) \tag{27}$$

where $\nu = \delta_c/\sigma_M$, and $H_2(x) = x^2 - 1$ and $H_3(x) = x^3 - 3x$ are Hermite polynomials. We will
compare this prediction with $N$-body simulations in §5.
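For concreteness, the Edgeworth result can be turned into a small function. The sketch below assumes the first-order form $\partial\log n/\partial f_{NL} = \kappa_3^{(1)} H_3(\nu)/6 - [(d\kappa_3^{(1)}/dM)/(d\nu/dM)]\, H_2(\nu)/6$ and multiplies by 3 to obtain $\beta_g$. The ratio of mass derivatives is passed in as a single number (the derivative $d\kappa_3^{(1)}/d\nu$ along the mass sequence), since evaluating it requires $\sigma_M$ from a power spectrum. All names and the sample inputs are illustrative, not values from the paper.

```python
def hermite2(x):
    """H_2(x) = x^2 - 1."""
    return x * x - 1.0


def hermite3(x):
    """H_3(x) = x^3 - 3x."""
    return x**3 - 3.0 * x


def dlogn_dfnl(nu, kappa3, dkappa3_dnu):
    """First-order Edgeworth derivative of the log mass function with respect to f_NL.

    dkappa3_dnu stands for (d kappa3/dM) / (d nu/dM).
    """
    return kappa3 * hermite3(nu) / 6.0 - dkappa3_dnu * hermite2(nu) / 6.0


def beta_g(nu, kappa3, dkappa3_dnu):
    """Peak-background split relation beta_g = 3 dlog n / d f_NL."""
    return 3.0 * dlogn_dfnl(nu, kappa3, dkappa3_dnu)


if __name__ == "__main__":
    # Hypothetical inputs: nu = 3, kappa3 = 3e-4, slowly decreasing with nu.
    print(beta_g(3.0, 3.0e-4, -1.0e-5))
```

At large $\nu$ the first term dominates and $\beta_g \to \kappa_3^{(1)}\nu^3/2$, which is why the coefficient grows quickly for rare, highly biased halos.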



$^4$ Analogously to the $f_{NL}$ case, we have neglected two terms in Eq. (23). The term $g_{NL}(\Phi_l(\mathbf{x})^3 - 3\langle\Phi_l^2\rangle\Phi_l(\mathbf{x}))$ only
alters power spectra at order $O(g_{NL}^2)$, and we will neglect terms of this order. The term $g_{NL}(\Phi_s(\mathbf{x})^3 - 3\langle\Phi_s^2\rangle\Phi_s(\mathbf{x}))$
generates kurtosis in the density field and modifies the halo mass function [49], but in a way which is independent of
$\Phi_l$ and therefore does not contribute to large-scale clustering.

4 Barrier crossing model



In this section, we will study large-scale bias using a barrier crossing model, obtaining results which
are consistent with the peak-background split formalism from the previous section. The two
approaches are complementary: the barrier model has the advantage that it generates complete
predictions for halo statistics (such as the mass function or bias) via an algorithmic calculational
procedure, but obscures the physical intuition of the peak-background split. For completeness, the
calculations in this section will be sufficiently general to include the cases of Gaussian, $f_{NL}$-type,
and $g_{NL}$-type initial conditions.



4.1 Setting up the calculation 

The barrier crossing model is an old, widely influential idea in cosmology, in which halos of mass 
> M are identified with peaks in the smoothed linear density field [58]. Although more complex 
versions have been proposed, we will use the simplest version: a spherical collapse model with 
constant barrier height, defined formally as follows. 

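We model halos of mass $> M$ as regions where the smoothed linear density field $\delta_M(\mathbf{x})$ (defined
in Eq. (5)) exceeds the threshold value $\delta_c$, i.e. the halo number density $n_h(\mathbf{x})$ is given by:

$$n_h(\mathbf{x}) \propto \theta(\delta_M(\mathbf{x}) - \delta_c) \tag{28}$$

where $\theta$ is the step function

$$\theta(x) = \begin{cases} 1 & \text{if } x \ge 0 \\ 0 & \text{if } x < 0 \end{cases} \tag{29}$$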




Throughout this paper, we take $\delta_c = 1.42$; this value produces somewhat improved agreement
between the barrier model and simulations, compared to the Press-Schechter value $\delta_c = 1.69$.$^5$

To study halo bias in this model, we define the following notation. Let $\mathbf{x}$, $\mathbf{x}'$ be two points
separated by distance $r$, let $\delta_{lin}$ denote the (unsmoothed) linear density field at $\mathbf{x}$, and let $\delta'_M$ denote
the smoothed linear density field at $\mathbf{x}'$. We denote the joint PDF of these random variables by
$p(\delta_{lin}, \delta'_M)$, and denote the 1-variable PDF of $\delta'_M$ by $p(\delta'_M)$. We define

$$p_0 = \int_{\delta_c}^{\infty} d\delta'_M\, p(\delta'_M) \tag{30}$$

$$\xi_0(r) = \int d\delta_{lin}\, d\delta'_M\, p(\delta_{lin}, \delta'_M)\, \delta_{lin}\, \theta(\delta'_M - \delta_c) \tag{31}$$

These quantities are related to the halo mass function $n(M)$ and matter-halo correlation function
$\xi_{mh}(r)$, but there is one wrinkle. In the barrier crossing model, the field $n_h(\mathbf{x})$ defined in Eq. (28)
represents the number density of halos with mass $> M$, whereas we want to consider a sample of
halos with mass in a narrow mass range $(M, M + dM)$. Thus $n(M)$ and $\xi_{mh}(r)$ are obtained by
taking derivatives as follows:

$$n(M) = -\frac{2\rho_m}{M}\left(\frac{dp_0}{dM}\right) \tag{32}$$

$$\xi_{mh}(r) = \frac{d\xi_0(r)/dM}{dp_0/dM} \tag{33}$$

$^5$ We experimented with using a mass-dependent barrier $\delta_c(\nu)$ chosen for consistency with a universal mass function
such as Sheth-Tormen [59] or Warren [60], but found that this did not result in further improvement.

4.2 Mass function, halo bias, and interpretation 

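In principle, calculation of the halo mass function and large-scale bias in the barrier crossing model
has now been reduced to evaluation of Eqs. (30)-(33). We defer details of the calculation to
Appendix A and quote the final results. The halo mass function is given by:

$$n(M) = \frac{2\rho_m}{M}\left(\frac{d\log\sigma^{-1}}{dM}\right)\frac{\nu\, e^{-\nu^2/2}}{(2\pi)^{1/2}}\left[1 + f_{NL}\left(\kappa_3^{(1)}(M)\frac{\nu^3 - 3\nu}{6} - \frac{d\kappa_3^{(1)}/dM}{d(\log\sigma^{-1})/dM}\frac{\nu - \nu^{-1}}{6}\right) + g_{NL}\left(\kappa_4^{(1)}(M)\frac{\nu^4 - 6\nu^2 + 3}{24} - \frac{d\kappa_4^{(1)}/dM}{d(\log\sigma^{-1})/dM}\frac{\nu^2 - 3}{24}\right)\right] \tag{34}$$

The halo bias $b(k) = P_{mh}(k)/P_{mm}(k)$ is given by (in the large-scale limit $k \to 0$):

$$b(k) = b_1 + b_{1f} f_{NL} + b_{1g} g_{NL} + \frac{\beta_f f_{NL} + \beta_g g_{NL}}{\alpha(k,z)} \tag{35}$$

where:

$$b_1 = 1 + \frac{\nu^2 - 1}{\delta_c} \tag{36}$$

$$b_{1f} = -\kappa_3^{(1)}(M)\left(\frac{\nu^3 - \nu}{2\delta_c}\right) + \frac{d\kappa_3^{(1)}/dM}{d(\log\sigma^{-1})/dM}\left(\frac{\nu + \nu^{-1}}{6\delta_c}\right) \tag{37}$$

$$b_{1g} = -\kappa_4^{(1)}(M)\left(\frac{\nu^4 - 3\nu^2}{6\delta_c}\right) + \frac{d\kappa_4^{(1)}/dM}{d(\log\sigma^{-1})/dM}\left(\frac{\nu^2}{12\delta_c}\right) \tag{38}$$

$$\beta_f = 2\nu^2 - 2 \tag{39}$$

$$\beta_g = \kappa_3^{(1)}(M)\left(\frac{\nu^3 - 3\nu}{2}\right) - \frac{d\kappa_3^{(1)}/dM}{d(\log\sigma^{-1})/dM}\left(\frac{\nu - \nu^{-1}}{2}\right) \tag{40}$$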

Although the above expressions are the result of a purely formal calculation, we will now show that 
each term has a natural interpretation. 

Considering first the halo mass function (34), we have found a Press-Schechter mass function
(with $\delta_c = 1.42$) in the Gaussian case, plus first-order corrections in $f_{NL}$ and $g_{NL}$ which agree with
those found in [49, 52] using the Edgeworth expansion. This agreement is expected since the two
calculations are based on the same barrier crossing model.
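The link between the barrier model and the Edgeworth corrections can be illustrated numerically. The sketch below assumes the first-order Edgeworth form of the tail probability, $p_0(\nu) = \frac{1}{2}\mathrm{erfc}(\nu/\sqrt{2}) + \phi(\nu)\,[\kappa_3 H_2(\nu)/6 + \kappa_4 H_3(\nu)/24]$ with $\phi$ the unit Gaussian density, and checks that $-dp_0/d\nu$ at fixed cumulants equals $\phi(\nu)\,[1 + \kappa_3 H_3(\nu)/6 + \kappa_4 H_4(\nu)/24]$. The mass dependence of the cumulants, which supplies the remaining terms in the mass function, is deliberately ignored here, and all names are illustrative.

```python
import math


def hermite(n, x):
    """Probabilists' Hermite polynomials H_2, H_3, H_4."""
    return {2: x * x - 1.0, 3: x**3 - 3.0 * x, 4: x**4 - 6.0 * x * x + 3.0}[n]


def p0_edgeworth(nu, kappa3=0.0, kappa4=0.0):
    """Fraction of volume above the barrier, to first order in the cumulants."""
    gauss = math.exp(-0.5 * nu * nu) / math.sqrt(2.0 * math.pi)
    tail = 0.5 * math.erfc(nu / math.sqrt(2.0))
    return tail + gauss * (kappa3 * hermite(2, nu) / 6.0 + kappa4 * hermite(3, nu) / 24.0)


def minus_dp0_dnu(nu, kappa3=0.0, kappa4=0.0, eps=1e-5):
    """Numerical -dp0/dnu at fixed cumulants (central finite difference)."""
    lo = p0_edgeworth(nu - eps, kappa3, kappa4)
    hi = p0_edgeworth(nu + eps, kappa3, kappa4)
    return (lo - hi) / (2.0 * eps)


def expected(nu, kappa3=0.0, kappa4=0.0):
    """Gaussian weight times [1 + kappa3 H3/6 + kappa4 H4/24]."""
    gauss = math.exp(-0.5 * nu * nu) / math.sqrt(2.0 * math.pi)
    return gauss * (1.0 + kappa3 * hermite(3, nu) / 6.0 + kappa4 * hermite(4, nu) / 24.0)


if __name__ == "__main__":
    print(minus_dp0_dnu(2.5, 0.01, 0.001), expected(2.5, 0.01, 0.001))
```

The two printed numbers agree to the accuracy of the finite difference, reflecting the identity $\frac{d}{d\nu}[\phi H_n] = -\phi H_{n+1}$ that raises the Hermite index by one when the threshold probability is differentiated.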

Moving on to halo bias, in the Gaussian case, we predict that $b(k)$ is constant on large scales,
with value $b_1$ given by Eq. (36). The peak-background split argument suggests a general relation
between the large-scale halo bias and the halo mass function, which applies to any universal
mass function of the form:

$$n(M) = \frac{\rho_m}{M} f(\nu)\frac{d\log\sigma^{-1}}{dM} \tag{41}$$

On large scales, the bias is predicted to be scale-independent and given by [61]:

$$b_1 = 1 - \frac{1}{\delta_c}\frac{d\log f}{d\log\nu} \tag{42}$$

Comparing our predictions (34), (36) for $n(M)$ and $b_1$, we find agreement, i.e. Eq. (36) for $b_1$
can be interpreted as the general peak-background split expression for halo bias, specialized to the
Press-Schechter mass function.

More generally, the $b_{1f}$ and $b_{1g}$ contributions to the bias (Eqs. (37), (38)) represent shifts in the
scale-independent part of the bias due to primordial non-Gaussianity. It is straightforward to check
that these terms can be obtained by plugging the non-Gaussian mass function in Eq. (34) into the
peak-background split prediction (42) for scale-independent bias, i.e. the $b_{1f}$ and $b_{1g}$ terms can be
interpreted as changes to the bias which are entirely due to the mass function being perturbed in a
non-Gaussian cosmology. This type of term (scale-independent bias proportional to $f_{NL}$) was first
found for $f_{NL}$ cosmologies in [30]. Note that a scale-independent shift is unobservable in practice,
and cannot be used to constrain non-Gaussianity, since the bias of a real tracer population, such as
galaxies or quasars, is a free parameter.
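The "straightforward check" can also be done numerically. The sketch below assumes the first-order barrier-model mass function, written as $f(\nu) \propto \nu e^{-\nu^2/2}\,[1 + f_{NL} c_f(\nu) + g_{NL} c_g(\nu)]$ with $c_f = \kappa_3^{(1)}(\nu^3 - 3\nu)/6 - s_3(\nu - \nu^{-1})/6$ and $c_g = \kappa_4^{(1)}(\nu^4 - 6\nu^2 + 3)/24 - s_4(\nu^2 - 3)/24$, where $s_n$ denotes $(d\kappa_n^{(1)}/dM)/(d\log\sigma^{-1}/dM)$. It applies $b = 1 - \delta_c^{-1}\, d\log f/d\log\nu$ by finite differences at fixed mass and compares the first-order shifts with closed forms for $b_{1f}$ and $b_{1g}$. Function names and test inputs are hypothetical.

```python
import math


def log_f(nu, fnl=0.0, gnl=0.0, k3=0.0, k3_slope=0.0, k4=0.0, k4_slope=0.0):
    """Log of the nu-dependent part of the barrier-model mass function.

    k3, k4 are kappa_3^(1)(M), kappa_4^(1)(M); k3_slope, k4_slope are
    (d kappa/dM) / (d log sigma^-1/dM). All are held fixed at fixed mass.
    """
    corr_f = k3 * (nu**3 - 3 * nu) / 6.0 - k3_slope * (nu - 1.0 / nu) / 6.0
    corr_g = k4 * (nu**4 - 6 * nu**2 + 3) / 24.0 - k4_slope * (nu**2 - 3) / 24.0
    return math.log(nu) - 0.5 * nu**2 + math.log(1.0 + fnl * corr_f + gnl * corr_g)


def pbs_bias(nu, delta_c=1.42, eps=1e-5, **kwargs):
    """b = 1 - (1/delta_c) dlog f / dlog nu, by central finite difference."""
    up = log_f(nu * math.exp(eps), **kwargs)
    down = log_f(nu * math.exp(-eps), **kwargs)
    return 1.0 - (up - down) / (2.0 * eps) / delta_c


def b1f_closed(nu, k3, k3_slope, delta_c=1.42):
    """Closed-form first-order shift of the bias per unit f_NL."""
    return (-k3 * (nu**3 - nu) / 2.0 + k3_slope * (nu + 1.0 / nu) / 6.0) / delta_c


def b1g_closed(nu, k4, k4_slope, delta_c=1.42):
    """Closed-form first-order shift of the bias per unit g_NL."""
    return (-k4 * (nu**4 - 3 * nu**2) / 6.0 + k4_slope * nu**2 / 12.0) / delta_c


if __name__ == "__main__":
    nu, small = 2.0, 1e-4
    shift = (pbs_bias(nu, fnl=small, k3=1.0, k3_slope=0.3) - pbs_bias(nu)) / small
    print(shift, b1f_closed(nu, 1.0, 0.3))
```

In the example the non-Gaussian parameter is taken very small and the cumulant of order unity, which is equivalent to realistic values of $\kappa_3^{(1)} \sim 10^{-4}$ with larger $f_{NL}$ as long as the product stays in the linear regime.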

The $\beta_f$ contribution to the bias is the well-known scale-dependent bias in an $f_{NL}$
cosmology. Comparing Eq. (39) for $\beta_f$ with Eqs. (34), (36), this term can be written either as $\beta_f =
2\partial(\log n)/\partial(\log\Delta_\Phi)$ or $\beta_f = 2\delta_c(b_1 - 1)$. (In §3, we referred to these as "weak" and "strong"
predictions.)

The $\beta_g$ contribution to the bias is the focus of this paper: scale-dependent bias in a $g_{NL}$
cosmology. Eq. (40) gives this term in the "strong" form that was found previously (Eq. (27))
using the peak-background split argument. Alternately, we can write this term in the "weak" form
$\beta_g = 3\partial(\log n)/\partial f_{NL}$ using Eq. (34).

In summary, we have found that the complete expression for large-scale halo bias in the barrier
crossing model (Eq. (35)) agrees perfectly with the peak-background split calculation from §3. The
bias contains a scale-independent part $(b_1 + b_{1f} f_{NL} + b_{1g} g_{NL})$ which can be obtained from the
halo mass function, via the general relation (42). The scale-independent bias depends on $f_{NL}$ and
$g_{NL}$, because the halo mass function depends on these parameters. The bias also contains a
scale-dependent part $(\beta_f f_{NL} + \beta_g g_{NL})/\alpha(k)$ whose coefficients can be calculated explicitly and agree with
the peak-background split predictions.



4.3 Comparison with previous work 

It is interesting to compare the above calculations with the results of [62] (see also [43]), where $\beta_g$
was calculated using the MLB formula [63], which gives $N$-point functions of halos as an asymptotic
series in $\nu$. The scale-dependent $g_{NL}$ bias was found to be (in our notation):

^ = 4\M) 5 -^^ (43) 

When this prediction was compared to $N$-body simulations, it was found to be a poor fit.

Comparing $\beta_g^{MLB}$ with our calculation (40) for $\beta_g$, it is seen that the two agree in the high-mass
limit $\nu \to \infty$, but disagree in subleading terms. This is expected since the MLB formula is based on
the same barrier crossing model that we have used, but it is an asymptotic result, whereas we have
done an exact calculation (to first order in $f_{NL}$, $g_{NL}$). For realistic halo masses, the "subleading"
terms neglected in the MLB formula are of order one (to quantify this better, $\beta_g$ and $\beta_g^{MLB}$ agree to
10% only when the halo bias $b_1 > 15$), so in practice the two predictions are quite different.

Recently, ref. [46] argued that the barrier crossing model cannot generate correct predictions for
general non-Gaussian initial conditions such as the $g_{NL}$ model, but we found the opposite conclusion:
brute-force calculation in the barrier crossing model, collecting all terms of order $O(g_{NL})$, agrees
precisely (i.e. to all orders in $1/\nu$) with the peak-background split. It seems intuitively plausible that
the two must be consistent, since the peak-background split argument depends only on the assumption
that halo formation is determined by the local density field, and the barrier crossing model is a
concrete example of a model in which this assumption is satisfied.

5 Results from $N$-body simulations

In the last two sections, we have obtained complete analytic predictions for large-scale bias in a
$g_{NL}$ cosmology, finding agreement between the peak-background split formalism (§3) and a barrier
crossing model based on spherical collapse (§4).

To compare these predictions with simulation, we performed collisionless $N$-body simulations
using the GADGET-2 TreePM code [64]. Simulations were done using periodic box size $R_{box} = 1600$
$h^{-1}$ Mpc, particle count $N_p = 1024^3$, and force softening length $R_s = 0.05(R_{box}/N_p^{1/3})$. With these
parameters and the fiducial cosmology from §1, the particle mass is $m_p = 2.92 \times 10^{11}$ $h^{-1} M_\odot$.

We generate initial conditions by simulating a Gaussian primordial potential $\Phi$, and applying $f_{NL}$
or $g_{NL}$ corrections by straightforward use of Eq. (1). We linearly evolve to redshift $z_{ini} = 100$ using
the transfer function$^6$ from CAMB [48], and obtain initial particle positions at this redshift using
the Zeldovich approximation [65]. (At $z_{ini} = 100$, transient effects due to use of this approximation
should be negligible [66].)

After running the $N$-body simulation, we group particles into halos using an MPI parallelized
implementation of the friends-of-friends algorithm [67] with link length $L_{FOF} = 0.2 R_{box} N_p^{-1/3}$. For
a halo containing $N_{FOF}$ particles, we assign a halo position given by the mean of the individual
particle positions. We estimate halo bias $b(k) = P_{mh}(k)/P_{mm}(k)$ using the procedure described in
Appendix A of [68]. The statistical error $\Delta b(k)$ obtained using this procedure is smaller than the
error that would be obtained assuming uncorrelated estimates of the power spectra $P_{mm}$ and $P_{mh}$,
since shared sample variance is taken into account.

Results in this paper are based on 4 simulations with Gaussian initial conditions, 5 simulations
with each of $g_{NL} = \pm 2 \times 10^6$, and 3 simulations with each of $f_{NL} = \pm 250$ (for a total of 20 simulations).

$^6$ One subtlety here: straightforward use of CAMB's transfer function at redshift 100 leads to inconsistencies since
CAMB includes radiation (which is not negligible at $z = 100$) in its expansion history, while GADGET does not. For
this reason we use CAMB's linear transfer function at low redshift and extrapolate back to $z = 100$ using the growth
function in an $\Omega_{rad} = 0$ universe.

Figure 1: An example to illustrate that halo bias in a $g_{NL}$ cosmology takes the functional
form $b(k) = b_1 + \beta_g g_{NL}/\alpha(k)$. This figure corresponds to redshift $z = 0.5$ and halo mass range
$1.15 < M < 1.83 \times 10^{14}$ $h^{-1} M_\odot$, but we find the same functional form for all redshifts and halo
masses.



5.1 Fitting the functional form $b(k) = b_1 + \beta_g g_{NL}/\alpha(k)$

We now compare our analytic prediction for $b(k)$ to simulation in several steps, corresponding to
increasingly strong versions of the prediction.

First, consider the weakest possible question: our analytic prediction for the bias is of the
functional form

$$b(k) = b_1 + \beta_g\frac{g_{NL}}{\alpha(k)} \tag{44}$$

Is this a good fit to simulation, if we treat the coefficients $b_1$ and $\beta_g$ as free parameters? (We
will compare our analytic prediction for $\beta_g$ to simulation in the next subsection; for now we are just
testing whether the functional form (44) is correct.)
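A two-parameter fit of this kind is linear in $b_1$ and $\beta_g$, so it can be solved in closed form. The sketch below is a minimal weighted least-squares fit of $b(k) = b_1 + \beta_g g_{NL}/\alpha(k)$. It assumes independent Gaussian errors in each $k$-bin, which is a simplification relative to the estimator of [68] that accounts for shared sample variance, and the function name, inputs, and synthetic numbers are illustrative.

```python
def fit_bias(inv_alpha, bias, sigma, gnl):
    """Weighted least-squares fit of b(k) = b1 + beta_g * gnl / alpha(k).

    inv_alpha, bias, sigma are equal-length sequences holding 1/alpha(k), the
    measured bias, and its 1-sigma error. Returns (b1, beta_g, chi2).
    Assumes uncorrelated errors between k-bins.
    """
    x = [gnl * a for a in inv_alpha]
    w = [1.0 / s**2 for s in sigma]
    s0 = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, bias))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, bias))
    det = s0 * sxx - sx * sx            # determinant of the normal equations
    b1 = (sxx * sy - sx * sxy) / det
    beta_g = (s0 * sxy - sx * sy) / det
    chi2 = sum(wi * (yi - b1 - beta_g * xi) ** 2 for wi, xi, yi in zip(w, x, bias))
    return b1, beta_g, chi2


if __name__ == "__main__":
    # Synthetic, noise-free data generated from known parameters.
    inv_alpha = [1 / 500.0, 1 / 1000.0, 1 / 2000.0, 1 / 4000.0]
    truth = [3.6 + 7.0e-4 * 2.0e6 * a for a in inv_alpha]
    print(fit_bias(inv_alpha, truth, [0.05] * 4, gnl=2.0e6))
```

With real data the returned $\chi^2$ would be compared with its expected distribution for (number of bins $- 2$) degrees of freedom to judge whether the functional form is adequate.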

In Fig. 1, we show some example fits of this form, for redshift $z = 0.5$ and halo mass range
$1.15 < M < 1.83 \times 10^{14}$ $h^{-1} M_\odot$. Each fit was performed using bias estimates from 4 independent
simulations with $L_{box} = 1600$ $h^{-1}$ Mpc and wavenumbers $k < 0.04$ $h$ Mpc$^{-1}$. We find good $\chi^2$
values for the fits, with recovered parameters:

$$\begin{aligned}
b_1 &= 3.653 \pm 0.026 && \text{for } g_{NL} = 0 \\
(b_1, 10^3\beta_g) &= (3.575 \pm 0.038,\ 0.581 \pm 0.056) && \text{for } g_{NL} = 2 \times 10^6 \\
(b_1, 10^3\beta_g) &= (3.824 \pm 0.039,\ 0.935 \pm 0.060) && \text{for } g_{NL} = -2 \times 10^6
\end{aligned} \tag{45}$$

We note that the recovered bias parameters (45) in this example show that both $b_1$ and $\beta_g$ are
$g_{NL}$-dependent. In the barrier crossing model, we made a prediction for the $g_{NL}$ dependence of
$b_1$ (Eq. (38)). We find good agreement between this prediction and our simulations. Note that
in practice, the $g_{NL}$ dependence of $b_1$ is unobservable since for a real tracer population, the halo
occupation distribution is not known precisely and $b_1$ must be treated as a free parameter to be
determined from data.

The observed $g_{NL}$ dependence of $\beta_g$ corresponds to scale-dependent bias of order $O(g_{NL}^2)$ or
higher (note that $\beta_g$ is defined in such a way that constant $\beta_g$ corresponds to scale-dependent
bias which is linear in $g_{NL}$). This complicates comparison with our analytic predictions, since we
have only calculated the bias to order $O(g_{NL})$. We address this by estimating $\beta_g$ by averaging the
estimates obtained from simulations with $g_{NL} = \pm 2 \times 10^6$, thus removing contributions to $b(k)$
which are proportional to $g_{NL}^2$. Note that this does not remove $O(g_{NL}^3)$ contributions to the bias,
but we have checked that such contributions are negligible for $g_{NL} = \pm 2 \times 10^6$, by comparing with
simulations with halved step size.
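The effect of the averaging can be seen by writing the scale-dependent part of the bias as $(c_1 g_{NL} + c_2 g_{NL}^2 + c_3 g_{NL}^3)/\alpha(k)$: the fitted coefficient is then $\beta_g(g_{NL}) = c_1 + c_2 g_{NL} + c_3 g_{NL}^2$, and the mean of the values at $\pm g_{NL}$ cancels the $c_2$ term. The short sketch below applies this to the two $\beta_g$ values quoted in Eq. (45). Combining the errors in quadrature assumes the two sets of simulations are statistically independent, which is an assumption of this example.

```python
import math


def average_beta_g(beta_plus, err_plus, beta_minus, err_minus):
    """Mean of beta_g from +g_NL and -g_NL runs, assuming independent errors."""
    mean = 0.5 * (beta_plus + beta_minus)
    err = 0.5 * math.sqrt(err_plus**2 + err_minus**2)
    return mean, err


# Values quoted in Eq. (45), in units of 1e-3.
mean, err = average_beta_g(0.581, 0.056, 0.935, 0.060)
print(f"10^3 beta_g = {mean:.3f} +/- {err:.3f}")
```

The sizeable difference between the two input values is itself a measure of the $O(g_{NL}^2)$ contribution that the average removes.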

Repeating this fitting procedure for redshifts $z \in \{2, 1, 0.5, 0\}$ and a range of halo masses (the
precise set of halo mass bins used is shown in Fig. 2 below), we find $\chi^2$ values which are consistent
with their expected distribution, i.e. we find that the functional form (44) is a good fit to the
simulations for a wide range of redshifts and halo masses. For this reason, in subsequent sections,
we will "compress" the estimates of $b(k)$ in each simulation (as shown in Fig. 1) to two numbers ($b_1$
and $\beta_g$), with statistical errors given by the fitting procedure.

5.2 Comparison with analytic predictions 

Now that we have established the functional form $b(k) = b_1 + \beta_g g_{NL}/\alpha(k)$ of the bias, and a procedure
for estimating $\beta_g$ from simulation as a function of redshift and halo mass, we would like to compare
with our analytic predictions for $\beta_g$.

First, consider the "weak" form of the prediction ($\beta_g = 3(\partial \log n/\partial f_{NL})$) obtained from the peak-
background split argument. We can test this prediction by estimating the derivative $(\partial \log n/\partial f_{NL})$
directly from simulations, by taking finite differences of $\log(n)$ in simulations with $f_{NL} = \pm 250$.
(We checked convergence in the step size.) We find that the prediction holds perfectly (within the
statistical errors of the simulations) for all redshifts and halo masses (Fig. 2).
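
The finite-difference estimate described above amounts to a central difference of $\log n$ between the two $f_{NL}$ runs. A minimal sketch, assuming halo abundances `n_plus` and `n_minus` measured in the same mass bin at $f_{NL} = \pm 250$ (the function and argument names are hypothetical):

```python
import math


def beta_g_from_mass_function(n_plus, n_minus, f_nl_step=250.0):
    """Peak-background split estimate beta_g = 3 * d(log n)/d(f_NL).

    n_plus, n_minus: halo abundances in one mass bin, measured in
    simulations with f_NL = +f_nl_step and f_NL = -f_nl_step.
    The derivative is approximated by a central finite difference.
    """
    dlogn_dfnl = (math.log(n_plus) - math.log(n_minus)) / (2.0 * f_nl_step)
    return 3.0 * dlogn_dfnl


# Toy check: if n scales as exp(c * f_NL) with c = 1e-4, beta_g should be 3e-4.
beta_demo = beta_g_from_mass_function(math.exp(0.025), math.exp(-0.025))
```

Only the ratio `n_plus / n_minus` enters, so the overall normalization of the mass function drops out of the estimate.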

Second, consider the "strong" Edgeworth prediction (Eq. (40)), in which an explicit formula for
$\beta_g$ is given. In this case, we find reasonable agreement at high mass ($M > 10^{14}\, h^{-1} M_\odot$), but the
prediction breaks down at low halo mass (Fig. 2).

Our interpretation is as follows. The peak-background split prediction $\beta_g = 3(\partial \log n/\partial f_{NL})$ is
a universal relation between bias in a $g_{NL}$ cosmology and the mass function in an $f_{NL}$ cosmology.
Although "weak" in the sense that it does not supply a closed-form expression for $\beta_g$, the derivation
makes few assumptions, and one expects it to be exact. In order to constrain $g_{NL}$ from real data,
we need a "strong" prediction which expresses $\beta_g$ in closed form, using only observable quantities
(i.e. the analog of the Dalal et al. formula $\beta_f = 2\delta_c(b_1 - 1)$ for an $f_{NL}$ cosmology). Using the
Edgeworth expansion, one can make such a prediction in the context of the barrier crossing model
(Eq. (40)), and obtain rough agreement with simulations, but the level of agreement is not really
good enough for doing precision cosmology. Therefore, we next propose a slightly modified version
of the Edgeworth prediction.

Figure 2: Comparison of the "weak" and "strong" predictions for the scale-dependent bias in a
$g_{NL}$ cosmology. Blue squares: Direct estimates of the bias, extracted from simulations with
$g_{NL} = \pm 2 \times 10^6$ as described in §5.1. Green circles: "Weak" analytic prediction for the bias
($\beta_g = 3(\partial \log n/\partial f_{NL})$) from the peak-background split formalism, showing perfect agreement. The
estimates of $(\partial \log n/\partial f_{NL})$ shown in the figure were obtained directly from simulations with $f_{NL} =
\pm 250$. Red dotted curve: Edgeworth prediction for the bias (Eq. (40)). Good agreement is
seen at high mass, but at low masses Edgeworth underpredicts $3(\partial \log n/\partial f_{NL})$. We will find an
improvement in §5.3.

5.3 A simple universal formula for the bias in a $g_{NL}$ cosmology

We would like to slightly modify the Edgeworth prediction (40) for $\beta_g$ so that it agrees better with
N-body simulations. It is also convenient to have a prediction in which $\beta_g$ is given as a function of
observable quantities: Gaussian bias $b_1$ (rather than halo mass, which is unobservable) and redshift
$z$.

We start by rewriting the Edgeworth prediction (40) for $\beta_g$ in terms of variables $(b_1, z)$. The
following fitting functions for $\kappa_3$ and $d\kappa_3/d\log\sigma^{-1}$ are convenient:

$$\kappa_3 = 0.000329\,(1 + 0.09 z)\, b_1^{-0.09} \qquad (46)$$

$$\frac{d\kappa_3}{d\log\sigma^{-1}} = -0.000061\,(1 + 0.22 z)\, b_1^{-0.25} \qquad (47)$$

For purposes of this subsection, we define the quantity $\nu$ to be given in terms of $b_1$ and $z$ by:

$$\nu = [\delta_c (b_1 - 1) + 1]^{1/2} \quad (\text{where } \delta_c = 1.42) \qquad (48)$$

The Edgeworth prediction for $\beta_g$ can be written in the following form:

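$$\beta_g^{\rm Edge.} = \kappa_3 \left[ -1 + \frac{3}{2}(\nu - 1)^2 + \frac{1}{2}(\nu - 1)^3 \right] - \frac{d\kappa_3}{d\log\sigma^{-1}} \left( \frac{\nu - \nu^{-1}}{2} \right) \qquad (49)$$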



Empirically, we find that if we tweak the Edgeworth prediction by changing the coefficients of the 
polynomial in brackets as follows: 



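$$\beta_g = \kappa_3 \left[ -0.7 + 1.4(\nu - 1)^2 + 0.6(\nu - 1)^3 \right] - \frac{d\kappa_3}{d\log\sigma^{-1}} \left( \frac{\nu - \nu^{-1}}{2} \right) \qquad (50)$$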



then we obtain good agreement with simulations (Fig. 3). The expression (50) for $\beta_g$ (with quantities
$\kappa_3$, $d\kappa_3/d\log\sigma^{-1}$, $\nu$ defined by Eqs. (46)-(48)) is one of the main results of this paper and is our
observational "bottom line" when constraining $g_{NL}$ from real data.
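
For readers who want to evaluate the fitting formula numerically, the sketch below chains Eqs. (46)-(48) into Eq. (50). It is an illustrative transcription rather than reference code: the function names are hypothetical, and it assumes the cubic power on the last term of the bracket and the factor $(\nu - \nu^{-1})/2$ as written in Eqs. (49)-(50).

```python
DELTA_C = 1.42  # collapse threshold quoted with Eq. (48)


def kappa3(b1, z):
    """Fitting function for kappa_3, Eq. (46)."""
    return 0.000329 * (1.0 + 0.09 * z) * b1 ** (-0.09)


def dkappa3_dlog_sigma_inv(b1, z):
    """Fitting function for d(kappa_3)/d(log sigma^-1), Eq. (47)."""
    return -0.000061 * (1.0 + 0.22 * z) * b1 ** (-0.25)


def nu_of_b1(b1):
    """Peak height as a function of Gaussian bias, Eq. (48)."""
    return (DELTA_C * (b1 - 1.0) + 1.0) ** 0.5


def beta_g(b1, z):
    """Scale-dependent g_NL bias coefficient for a narrow mass bin, Eq. (50)."""
    nu = nu_of_b1(b1)
    x = nu - 1.0
    bracket = -0.7 + 1.4 * x**2 + 0.6 * x**3
    return (kappa3(b1, z) * bracket
            - dkappa3_dlog_sigma_inv(b1, z) * (nu - 1.0 / nu) / 2.0)
```

As a sanity check, at $b_1 = 1$ the second term vanishes and `beta_g(1.0, z)` reduces to $-0.7\,\kappa_3$; the function is only meaningful for $b_1 > 1 - 1/\delta_c$, where $\nu$ is real.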

We have motivated our "tweak" to the Edgeworth prediction as essentially a fitting function
for the $\nu$ dependence (although it is worth noting that the $z$ dependence is correctly predicted by
the barrier crossing model). A speculative interpretation of this tweak, which we will defer for
future work, is as follows. In the barrier crossing model, the second-order halo bias is given by
$b_2 = (\nu^3 - 3\nu)/(\delta_c \sigma_M)$. It is tempting to conjecture that the expression in brackets in Eq. (50)
is generally equal to $(\delta_c \sigma_M b_2/2)$, and interpret our "tweak" to the Edgeworth prediction (49) as
perturbing the relation between $b_1$ and $b_2$, relative to the barrier crossing model. This opens up
the possibility of directly measuring the second-order bias and determining $\beta_g$ directly. To study
the viability of this idea, one would need to compare $\beta_g$ in simulation to some other estimate of
second-order halo bias, such as the halo bispectrum in the squeezed limit.
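
The link between the bracketed polynomial and the barrier-crossing $b_2$ can be checked directly for the untweaked Edgeworth form. The short calculation below expands in $x = \nu - 1$ (a variable introduced here for convenience) and assumes the Edgeworth bracket has coefficients $-1$, $3/2$, $1/2$, together with the expression for $b_2$ quoted above.

```latex
x \equiv \nu - 1:\qquad
\nu^3 - 3\nu = (1+x)^3 - 3(1+x) = -2 + 3x^2 + x^3

\Rightarrow\quad
-1 + \tfrac{3}{2}(\nu-1)^2 + \tfrac{1}{2}(\nu-1)^3
= \frac{\nu^3 - 3\nu}{2}
= \frac{\delta_c\,\sigma_M\, b_2}{2}
```

In this reading, the tweak in Eq. (50) replaces $(-1, 3/2, 1/2)$ by $(-0.7, 1.4, 0.6)$, which is what the conjecture would attribute to a modified relation between $b_1$ and $b_2$.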



5.4 An important caveat 

There is an important caveat when using Eq. (50), or indeed any fitting function for the $g_{NL}$ bias, to
constrain $g_{NL}$ from real data. It is tempting to compute $\beta_g$ by simply plugging the observed bias $b_1$
and redshift $z$ into Eq. (50). (Since the $z$-dependence is very mild, a rough estimate for the redshift
suffices.) However, we have only shown that this procedure is correct in the limit of a narrow bin in
halo mass and redshift, and a real tracer population will be a weighted average over $M$ and $z$.

Figure 3: Scale-dependent $g_{NL}$ bias coefficient $\beta_g$ as a function of redshift $z$ and halo bias $b_1$,
showing excellent agreement between our final analytic result (Eq. (50), dashed curves) and N-body
simulations (error bars).

For example, consider the case in which the "tracers" are the dark matter particles themselves,
i.e. each halo is weighted in proportion to its mass (assuming all mass is in halos). This tracer
population has bias $b_1 = 1$ (for the trivial reason that we are back to the dark matter field), so
straightforward use of Eq. (50) would suggest that $\beta_g \approx -0.00025$. (This value would make the
low-$k$ power spectrum a reasonably sensitive probe of $g_{NL}$.) In fact, the true $\beta_g$ of this tracer is
zero, since the matter power spectrum $P_{mm}(k)$ does not contain a term proportional to $g_{NL}/\alpha(k)$.
This example shows that the true $g_{NL}$ bias of a tracer population can differ significantly from the
value obtained by straightforward use of Eq. (50). In general, the $g_{NL}$ bias will depend on the full
HOD (halo occupation distribution) of the tracer population, not only on the Gaussian bias $b_1$.
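
The dependence on the full HOD can be made concrete with a small sketch. Because $b(k) = b_1 + \beta_g g_{NL}/\alpha(k)$ is linear in both coefficients, a population built from narrow halo bins has $b_1$ and $\beta_g$ equal to tracer-number-weighted averages over the bins. The function and argument names below are hypothetical, and the per-bin inputs are assumed to be supplied by the user (for instance from simulations).

```python
def population_bias(n_halo, mean_occupation, b1, beta_g):
    """Tracer-weighted b_1 and beta_g for a population spread over halo bins.

    n_halo:          halo abundance in each narrow (M, z) bin
    mean_occupation: mean number of tracers per halo in each bin (the HOD)
    b1, beta_g:      Gaussian bias and g_NL bias coefficient of each bin
    """
    weights = [n * occ for n, occ in zip(n_halo, mean_occupation)]
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("population contains no tracers")
    b1_pop = sum(w * b for w, b in zip(weights, b1)) / total
    beta_g_pop = sum(w * bg for w, bg in zip(weights, beta_g)) / total
    return b1_pop, beta_g_pop


# Toy two-bin population with made-up numbers.
b1_demo, beta_demo = population_bias(
    n_halo=[3.0, 1.0],
    mean_occupation=[1.0, 1.0],
    b1=[2.0, 4.0],
    beta_g=[0.0, 4.0e-4],
)
```

Two populations with the same averaged $b_1$ but different weights can therefore have different averaged $\beta_g$, because $\beta_g$ is not a linear function of $b_1$.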

One popular approach to modeling the HOD is to assume that halos below some minimum mass
$M_{\rm min}$ do not host tracers, whereas the mean number of tracers in a halo of mass $M > M_{\rm min}$ is
proportional to the total mass $M$. For reference, we give a fitting function for the $g_{NL}$ bias for this
HOD:



$$\beta_g = \kappa_3 \left[ 0.4(\nu - 1) + 1.5(\nu - 1)^2 + 0.6(\nu - 1)^3 \right] \qquad (51)$$



where for purposes of this equation, $\kappa_3$ and $\nu$ are defined as functions of the observables $b_1$ and $z$
by Eqs. (46), (48) above.

7 Note that there is no analogous caveat in the $f_{NL}$ case. Because the relation $\beta_f = 2\delta_c(b_1 - 1)$ is linear, it applies
to both a tracer population which is narrowly selected in $(M, z)$ and to a population which is an arbitrary weighted
average over $(M, z)$.

Eq. (51) applies to a mass-weighted population of halos above $M_{\rm min}$, whereas Eq. (50) applies to
a population which is narrowly selected in mass. The two agree for $b_1 > 2.5$, suggesting that HOD
dependence is small in practice for highly biased samples, but disagree qualitatively for $b_1 < 2.5$.
For example, the $g_{NL}$ bias $\beta_g$ changes sign at $b_1 \approx 2.1$ for the narrowly selected sample (Eq. (50)),
whereas $\beta_g$ is always positive for the mass-weighted sample (Eq. (51)).
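
The mass-weighted fitting function of Eq. (51) can be evaluated in the same way as Eq. (50). The sketch below is a hypothetical transcription, assuming the last term in the bracket is cubic and that $\kappa_3$ and $\nu$ follow Eqs. (46) and (48); the function name is not from the paper.

```python
DELTA_C = 1.42  # collapse threshold quoted with Eq. (48)


def beta_g_mass_weighted(b1, z):
    """g_NL bias coefficient for the mass-weighted HOD above M_min, Eq. (51)."""
    kappa3 = 0.000329 * (1.0 + 0.09 * z) * b1 ** (-0.09)   # Eq. (46)
    x = (DELTA_C * (b1 - 1.0) + 1.0) ** 0.5 - 1.0           # nu - 1, Eq. (48)
    return kappa3 * (0.4 * x + 1.5 * x**2 + 0.6 * x**3)


# Value for a sample with observed bias b_1 = 1.8 at z = 0.
beta_demo_18 = beta_g_mass_weighted(1.8, 0.0)
```

Every term in the bracket is non-negative for $b_1 \ge 1$, so this form vanishes at $b_1 = 1$ (the dark matter limit) and is positive above it, in contrast to the sign change of the narrow-bin formula.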

Our perspective is that, in order to obtain $g_{NL}$ constraints which are robust to HOD modeling
uncertainty, one should use highly biased samples ($b_1 > 2.5$), where this uncertainty will be
minimized. Samples which are not highly biased do not give robust constraints; for example, a tracer
population with $b_1 = 1.8$ can have a $g_{NL}$ bias $\beta_g$ which is negative, zero, or positive, depending on
the HOD.

For highly biased samples, it is useful to make the following observation: the $g_{NL}$ bias $\beta_g$ which
is obtained from straightforward use of Eq. (50) is always less than the true $g_{NL}$ bias $\beta_g^{\rm true}$.$^8$ This
follows from positivity of the second derivative $d^2\beta_g/db_1^2$. It follows that a $g_{NL}$ constraint obtained
using the Eq. (50) value of $\beta_g$ is always valid, but slightly overestimates the statistical error that could
be obtained if $\beta_g^{\rm true}$ were known. This effectively treats HOD uncertainty as an extra source of
systematic error.
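
The convexity argument is an instance of Jensen's inequality. The sketch below assumes, for illustration, that the tracer population is a mixture of narrow bins at fixed redshift with non-negative weights $w_i$ summing to one; the symbols $w_i$, $b_{1,i}$ and $b_1^{\rm obs}$ are notation introduced here.

```latex
b_1^{\rm obs} = \sum_i w_i\, b_{1,i}, \qquad
\beta_g^{\rm true} = \sum_i w_i\, \beta_g(b_{1,i})
\;\ge\; \beta_g\Big(\sum_i w_i\, b_{1,i}\Big) = \beta_g(b_1^{\rm obs})
\quad \text{if } \frac{d^2 \beta_g}{d b_1^2} \ge 0 .
```

Equality holds only when all of the weight sits in a single bin, which is the narrow-bin limit in which Eq. (50) was calibrated.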

6 Discussion 

We have computed large-scale halo bias for non-Gaussian initial conditions, using two analytic
frameworks: the peak-background split formalism (§3) and a barrier crossing model (§4), finding
agreement between the two. Although our emphasis has been on the constant-$f_{NL}$ and constant-$g_{NL}$
models, our calculational machinery should apply to more general non-Gaussian initial conditions.

The peak-background split formalism is simpler and also suggests a simple physical picture of
non-Gaussian cosmologies on large scales. In an $f_{NL}$ cosmology, the amplitude $\Delta_\Phi$ of the initial
fluctuations is not spatially constant, but is proportional to $(1 + 2 f_{NL} \Phi_l)$. Thus, $\Delta_\Phi$ has fluctuations
on large scales which are 100% correlated with the long-wavelength potential, generating halo bias
of the form $(\beta_f f_{NL}/\alpha(k))$. In a $g_{NL}$ cosmology, the small-scale skewness is nonzero and proportional
to $(g_{NL} \Phi_l)$, leading to halo bias of the form $(\beta_g g_{NL}/\alpha(k))$. The peak-background split argument is
very useful for generating universal relations such as $\beta_g = 3\,\partial(\log n)/\partial f_{NL}$, which are "weak" in the
sense that the RHS has not been expressed in terms of observable quantities, but have the advantage
of being exact (as can be seen by comparing the two sets of errorbars in Fig. 2).
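
The origin of the relation $\beta_g = 3\,\partial(\log n)/\partial f_{NL}$ can be outlined in a few lines. This is an illustrative sketch rather than the full argument of §3: it assumes the local model $\Phi = \Phi_G + g_{NL}\Phi_G^3$, a split $\Phi_G = \Phi_l + \Phi_s$ into long and short modes, the convention $\delta_m(k) = \alpha(k)\Phi(k)$, and it drops terms of second order in $\Phi_l$. The symbols $\Phi_s$ and $f_{NL}^{\rm eff}$ are notation introduced here.

```latex
\Phi = \Phi_G + g_{NL}\Phi_G^3
\;\Rightarrow\;
\Phi \supset \Phi_s + 3 g_{NL}\Phi_l\,\Phi_s^2 + O(\Phi_l^2)
\;\Rightarrow\;
f_{NL}^{\rm eff}(\mathbf{x}) = 3 g_{NL}\,\Phi_l(\mathbf{x})

\delta_h \supset \frac{\partial \log n}{\partial f_{NL}}\, f_{NL}^{\rm eff}
= 3\,\frac{\partial \log n}{\partial f_{NL}}\, g_{NL}\,\frac{\delta_m(k)}{\alpha(k)}
\;\Rightarrow\;
\beta_g = 3\,\frac{\partial \log n}{\partial f_{NL}}
```

In words, a long-wavelength potential fluctuation makes the small-scale field look locally like an $f_{NL}$ cosmology with $f_{NL}^{\rm eff} = 3 g_{NL}\Phi_l$, so the halo abundance responds through the $f_{NL}$ derivative of the mass function.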

The barrier crossing model generates all terms in the large-scale bias, including terms such as
$b_{1f}$ and $b_{1g}$ which are easy to miss, by a purely algorithmic calculational procedure. In addition,
the barrier crossing model generates "strong" forms of the bias coefficients (e.g. the Edgeworth
expression (40) for $\beta_g$), which are closed-form expressions in $M$ and $z$. However, these expressions are
not exact because the barrier crossing model is an approximation to the true process of halo formation.

8 This statement assumes that the probability that a halo hosts a tracer is a function only of the mass and redshift. If
the probability depends strongly on additional variables such as merger history, triaxiality, etc. then this will generate
additional contributions to $\beta_g$, in analogy to the $f_{NL}$ case [28,69]. In principle, selection biases to $\beta_g$ can be addressed
by folding the selection into the mass function when computing $\partial(\log n)/\partial f_{NL}$, but detailed study is beyond the scope
of this paper.

To obtain a "bottom line" expression for the scale-dependent $g_{NL}$ bias $\beta_g$ in terms of redshift $z$
and Gaussian bias $b_1$, we found it necessary to tweak slightly the $b_1$ dependence of the Edgeworth
prediction, arriving at the expression (50) which agrees very well with simulations. The caveat is
that Eq. (50) applies only to a halo population which has been selected in a narrow halo mass
and redshift range. In principle, one can calculate $\beta_g$ for a tracer population by multiplying by
the halo occupation distribution and integrating over mass and redshift. In practice, the HOD
is not known precisely and we have argued in §5.4 that the best approach is to only use highly
biased populations ($b_1 > 2.5$) for constraining $g_{NL}$. Since $\beta_g$ is a rapidly increasing function of $b_1$,
this strategy makes sense from the perspective of minimizing both statistical errors and systematic
errors due to HOD uncertainty. In data analysis, it may be useful to impose cuts which increase the
mean halo bias at the expense of reducing the number of tracers. Another advantage of subdividing
tracer populations is that this may permit $f_{NL}$ and $g_{NL}$ to be constrained simultaneously (with a
single tracer population, the two are degenerate).

A Barrier model calculations

In this appendix, we give details of the calculation of the halo mass function and large-scale bias
(Eqs. (34)-(40)) in the barrier crossing model, to first order in $f_{NL}$, $g_{NL}$.

First, consider evaluation of the integrals in Eqs. (30), (31). Primordial non-Gaussianity enters
the calculation by perturbing the PDFs which appear away from Gaussian distributions. This perturbation
can be written down explicitly using the Edgeworth expansion, which represents the PDF as a power
series in cumulants. The Edgeworth expansion for the 1-variable PDF $p(\delta'_M)$ is:

where we have kept terms of first order in $f_{NL}$, $g_{NL}$. We can now compute $p_0$ by plugging into the
definition (30):

Armed with this expression, it is easy to compute $n(M) = -2(\rho_m/M)(dp_0/dM)$, obtaining the form
of the mass function in Eq. (34).
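
For orientation, the standard first-order Edgeworth form and the resulting threshold integral are sketched below. This is a textbook reconstruction under stated assumptions, not a transcription of the paper's equations: the smoothed density is written as $\delta_M$, $\kappa_3$ and $\kappa_4$ denote its dimensionless third and fourth cumulants per unit $f_{NL}$ and $g_{NL}$, $H_n$ are the probabilists' Hermite polynomials, $\nu_c = \delta_c/\sigma_M$, and $p_0$ is taken to be the probability that the smoothed field exceeds $\delta_c$.

```latex
p(\delta_M) \approx \frac{e^{-\nu^2/2}}{(2\pi)^{1/2}\sigma_M}
\left[ 1 + f_{NL}\frac{\kappa_3}{6} H_3(\nu) + g_{NL}\frac{\kappa_4}{24} H_4(\nu) \right],
\qquad \nu = \frac{\delta_M}{\sigma_M}

p_0 = \int_{\delta_c}^{\infty} d\delta_M\, p(\delta_M)
\approx \frac{1}{2}\,{\rm erfc}\!\left(\frac{\nu_c}{\sqrt{2}}\right)
+ \frac{e^{-\nu_c^2/2}}{(2\pi)^{1/2}}
\left[ f_{NL}\frac{\kappa_3}{6} H_2(\nu_c) + g_{NL}\frac{\kappa_4}{24} H_3(\nu_c) \right]
```

The second line uses $H_n(\nu)\,e^{-\nu^2/2} = (-d/d\nu)^n e^{-\nu^2/2}$, so integrating a term with $H_n$ from $\nu_c$ to infinity leaves a boundary term with $H_{n-1}(\nu_c)$.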
Moving on to the 2-variable PDF $p(\delta_{\rm lin}, \delta'_M)$, the Edgeworth expansion is:

/ 

°°hnO0 M mn m.n. O0 \m O0 M 

\ m+n>3 




2ira\ in aM 

where $\sigma_{\rm lin} = \langle \delta_{\rm lin}^2 \rangle^{1/2}$ and the cumulant $\kappa_{m,n}$ is defined by:$^9$



p(S\m,8' M ) = exp 
we compute $\xi_0(r)$ by integrating Eq. (31) term by term, obtaining:



= /0 v, /9 «i,i(M,r) + — =— ^+^^ fl^zM 



(2tt) 1 /2 



(54) 



°lin°M 

Note that the cumulant $\kappa_n(M)$ defined previously in Eq. (7) is equal to $\kappa_{0,n}(M, r)$.
Keeping the first few terms in the Edgeworth expansion:$^{10}$

i / sl s' 2 ^ 



i^W|, 3M + ^M^MM1 H4{U) ) (57) 



To make further progress, we convert the correlation function to a power spectrum $P_0(k) = \int d^3r\, e^{i k \cdot r}\, \xi_0(r)$
and keep only the leading behavior of each term in the long-wavelength limit $k \to 0$.

d 3 q <i 3 q' 1 

P(k) 



[ d 3 re tk ' r Kl , 2 (M,r) = [ ^^W M (q)W M (q f ) <<5(k)«5(q)<5(-q')> 

(58) 



(Tun a(A:) 



9 A technical point: $\sigma_{\rm lin}$ is formally infinite, but it will cancel from the final results in Eqs. (36)-(40). One could
make $\sigma_{\rm lin}$ finite by introducing a smoothing scale $R$ for the matter field, and take the limit $R \to 0$ at the end of the
calculation.

10 The choice of terms to keep was dictated by the following considerations. Only terms with precisely one $\delta_{\rm lin}$
derivative will give nonzero contributions to the integral $\int d\delta_{\rm lin}\, \delta_{\rm lin}\, p(\delta_{\rm lin}, \delta'_M)$ appearing in $\xi_{mh}(r)$, so we have only
kept these terms. (Terms with two or more derivatives would contribute to the halo-halo correlation function $\xi_{hh}(r)$,
so they may be relevant for halo stochasticity.) We have also omitted terms whose leading contribution is second-order
or higher in $f_{NL}$ and $g_{NL}$.
/ d 3 re^ li3 (M,r) = -J^- f q 'f W M ( q )W M ( q ')W M (q")
a(g)a(g')

%iVL 4 1) (M)f^g') (59) 



dun 3 \a(k) 

where "$\to$" denotes the $k \to 0$ limit, and we have used Eq. (10) to simplify the last line. Putting
this together, we find the following expression for $P_0(k)$ in the $k \to 0$ limit:



Po(k) = 



(2^ [^7 ^ + L ~6 3 ( ^) + gsL^Htiy) 

a{k) 2 a{k) 

The halo bias in a narrow mass range is given by the derivative: 



(60)

$$b(k) = \frac{dP_0(k)/dM}{(dp_0/dM)\, P(k)} + 1 \qquad (61)$$

where the $+1$ converts Lagrangian to Eulerian bias. Plugging in the forms of $p_0$, $P_0$ in Eqs. (53), (60),
a long but straightforward calculation now gives the halo bias in the form given in the text (Eqs. (35)-
(40)).
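
As a consistency check on the structure of this calculation, the sketch below shows how the derivative in Eq. (61) turns long-wavelength terms of $P_0(k)$ into the bias coefficients quoted in the text. It is an illustrative outline under assumptions, not a transcription of Eq. (60): only a leading scale-independent term and two $1/\alpha(k)$ terms of $P_0(k)$ are kept (the $f_{NL}$ and $g_{NL}$ corrections to the scale-independent term are omitted), $\kappa_3$ is the skewness per unit $f_{NL}$, $\nu = \delta_c/\sigma_M$, $H_2(\nu) = \nu^2 - 1$ and $H_3(\nu) = \nu^3 - 3\nu$.

```latex
P_0(k) \supset \frac{e^{-\nu^2/2}}{(2\pi)^{1/2}}\, P(k)
\left[ \frac{1}{\sigma_M} + \frac{2 f_{NL}\,\nu}{\alpha(k)}
+ \frac{g_{NL}\,\kappa_3 H_2(\nu)}{2\,\alpha(k)} \right],
\qquad
\frac{dp_0}{dM} \approx -\frac{e^{-\nu^2/2}}{(2\pi)^{1/2}}\,\frac{d\nu}{dM}

b(k) - 1 \approx \frac{\nu^2 - 1}{\delta_c}
+ \frac{f_{NL}}{\alpha(k)}\, 2(\nu^2 - 1)
+ \frac{g_{NL}}{\alpha(k)}
\left[ \frac{\kappa_3}{2} H_3(\nu)
- \frac{d\kappa_3}{d\log\sigma^{-1}}\,\frac{H_2(\nu)}{2\nu} \right]
```

With $b_1 - 1 = (\nu^2 - 1)/\delta_c$, the $f_{NL}$ term reproduces $\beta_f = 2\delta_c(b_1 - 1)$, and the $g_{NL}$ bracket has the structure of the Edgeworth form in Eq. (49), since $H_3(\nu)/2 = -1 + \tfrac{3}{2}(\nu-1)^2 + \tfrac{1}{2}(\nu-1)^3$ and $H_2(\nu)/(2\nu) = (\nu - \nu^{-1})/2$.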